1. INTRODUCTION

Agarwood oil is the essential oil extracted from Agarwood trees of the genus Aquilaria, which
belongs to the Thymelaeaceae family. Resin is a type of volatile chemical substance that permeates
the heartwood of Agarwood trees [1]-[4]. High resin formation results in good-quality Agarwood
oil [2], [5]-[7]. Agarwood oil is one of the most useful oils, with major uses in perfumery, the
medical industry, fragrances, and religious ceremonies [8]-[11]. Its wide range of applications
contributes to its popularity in the worldwide essential oil market, including Malaysia.

Traditionally, the quality of essential oils has been evaluated and graded manually using sensory
evaluation based on their physical properties [12]-[14]. According to human perception and experience, an
essential oil of the highest quality has a lot of resin, a dark oil color, a powerful odor, and a long-lasting
scent [4], [5], [10]. However, different people may form different impressions and reach different
conclusions with this technique, so the sensory evaluation method is somewhat inaccurate. There is no
guarantee that grading essential oils by human sensory evaluation ensures their purity or quality. Because
the process is continuous when dealing with a large number of samples at once, grading by trained human
graders has a significant disadvantage in terms of objectivity and consistency, and it is labor-intensive
and time-consuming [12], [15]-[17].

Numerous approaches have been proposed and used to verify the quality of essential oils using
intelligent methods [13]-[15], [18]-[20]. With today's data analysis technology, Agarwood oil quality can
be classified solely on the basis of its chemical substances, allowing the oils to be assigned to their
respective classes (low, medium low, medium high, and high quality). The accuracy of the result can be
measured using statistical analysis. The highlight of this paper is the boxplot (also called
box-and-whisker plot) analysis of Agarwood oils, together with the labelling of outliers in the raw
data. The abundances of the Agarwood oil chemical compounds are the input for the boxplot analysis.
The objective of this study is to present the data distribution of the Agarwood oil chemical
substances using boxplot analysis.


2. THEORETICAL WORK 

An easy way to interpret this research is to use images or visuals that describe the results more
precisely. Boxplot analysis is one of the visuals used in joint displays [21]-[24]. Visualization using a
joint display can provide a structure for comparing many input data between groups [25]. The boxplot has
become an efficient, industry-standard tool for summarizing the lowest observation value, lowest quartile,
median, highest quartile, greatest observation value, and outliers in one diagram [26]-[28].

Figure 1 shows the elements of a boxplot. The red '+' symbol represents an outlier, also known as
an extreme value, which is located above or below the whiskers. The 25th percentile of the data is the
lowest quartile (Q1), while the 75th percentile is the highest quartile (Q3). The red line in the middle
of the box indicates the sample median. The minimum and maximum are the limits of the range of values in
the sample data [28]-[30].

Figure 1. The MATLAB overview of elements in boxplot [29]


In general, the shorter the whiskers, the more uniform the distribution of the data. The result is
accepted if 50% of the median is in the group [28], [31]. The whiskers are calculated from the
interquartile range (IQR), as shown in (1)-(3) [31].


IQR = Q3 - Q1 (1)


Pre-processing technique of Aquilaria species from Malaysia for four ... (Siti Mariatul Hazwa Mohd Huzir) 


ISSN: 2302-9285


where the IQR is obtained by subtracting the lowest quartile (Q1) from the highest quartile (Q3).
The minimum and maximum ranges of the dataset are given by (2) and (3):


Minimum range = Q1 - 1.5 (IQR) (2)
Maximum range = Q3 + 1.5 (IQR) (3)

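
As an illustration of (1)-(3), the following minimal Python sketch computes the quartiles, the IQR, and the whisker limits, and labels the observations outside those limits as outliers. It is not the MATLAB routine used in this study: the function names `whisker_limits` and `find_outliers`, the sample values, and the inclusive quartile interpolation of Python's `statistics.quantiles` are assumptions made for illustration, and MATLAB may compute quartiles slightly differently.

```python
from statistics import quantiles


def whisker_limits(values):
    """Return Q1, Q3, the IQR, and the limits defined in equations (1)-(3)."""
    # Assumption: inclusive (linear) interpolation between observations.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1              # equation (1)
    lower = q1 - 1.5 * iqr     # equation (2), minimum range
    upper = q3 + 1.5 * iqr     # equation (3), maximum range
    return q1, q3, iqr, lower, upper


def find_outliers(values):
    """Return the observations that fall outside the whisker limits."""
    _q1, _q3, _iqr, lower, upper = whisker_limits(values)
    return [v for v in values if v < lower or v > upper]


# Hypothetical abundances (%), for illustration only (not the study's data).
sample = [0.52, 0.55, 0.58, 0.60, 0.61, 0.63, 0.66, 0.99]
q1, q3, iqr, lower, upper = whisker_limits(sample)
outliers = find_outliers(sample)
```

For this hypothetical sample, the value 0.99 lies above the maximum range and is therefore labelled as an outlier.
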
3. METHOD 


This section describes the research procedure in chronological order and details the technique used
for data acquisition. It is divided into two sub-sections: sample acquisition and the pre-processing
technique.


3.1. Sample acquisition 

The Agarwood oil samples were prepared by the Forest Research Institute Malaysia (FRIM) and the
Bioaromatic Research Centre of Excellence (BARCE), Universiti Malaysia Pahang Al-Sultan Abdullah
(UMPSA). The study focuses only on the Aquilaria malaccensis species, but with varying oil
volume, origin, and age. FRIM and BARCE used gas chromatography-mass spectrometry (GC-MS) apparatus
to obtain the chemical substances of the oils [9]. There are 660 samples of Agarwood oil across the low,
medium low, medium high, and high qualities. The eleven important compounds tabulated in Table 1 were
used as the input, and the grades of Agarwood oil were used as the output, of the classification system.


Table 1. Data samples based on quality classification

Grade          Number of samples    Percentage (%)
Low            209                  31.91
Medium low     90                   13.74
Medium high    30                   4.58
High           326                  49.77

The 11 chemical compounds used for all grades are γ-eudesmol, ar-curcumene, β-dihydroagarofuran,
γ-cadinene, α-agarofuran, allo aromadendrene epoxide, valerianol, α-guaiene, 10-epi-γ-eudesmol,
β-agarofuran, and dihydrocollumellarin.


3.2. Pre-processing technique 

The first step in creating the boxplot is sorting the data samples into four groups: low, medium
low, medium high, and high quality. Two variables are involved: the x-axis lists the names of the
eleven important compounds of Agarwood oil (independent variable), and the y-axis gives the abundance of
each compound in percent (dependent variable). The boxplot's performance was then evaluated. The
flowchart of the experimental analysis in Figure 2 was followed to implement the distribution analysis on
the Agarwood oil samples. This paper focuses on the boxplot analysis of the four oil qualities.


Data input and output (low, medium low, medium high, and high quality)
-> Pre-processing (boxplot analysis)
-> Summarize: comparing all selected medians of abundances (%) of chemical compounds from
   high quality with other qualities

Figure 2. Flowchart of experimental analysis
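
The sorting and summarizing steps of Figure 2 can be sketched as follows. This is a minimal Python illustration, not the MATLAB implementation used in the study; the record layout, the names `group_by_grade` and `median_by_grade`, and the abundance values are hypothetical.

```python
from collections import defaultdict
from statistics import median

GRADES = ("low", "medium low", "medium high", "high")


def group_by_grade(samples):
    """Sort (grade, {compound: abundance}) records into the four grade groups."""
    groups = {grade: defaultdict(list) for grade in GRADES}
    for grade, abundances in samples:
        for compound, value in abundances.items():
            groups[grade][compound].append(value)
    return groups


def median_by_grade(groups):
    """Median abundance (%) of every compound within each grade."""
    return {
        grade: {name: median(values) for name, values in compounds.items()}
        for grade, compounds in groups.items()
    }


# Hypothetical records, for illustration only (not the study's data).
samples = [
    ("high", {"gamma-eudesmol": 0.70, "beta-agarofuran": 0.90}),
    ("high", {"gamma-eudesmol": 0.80, "beta-agarofuran": 0.95}),
    ("low", {"gamma-eudesmol": 0.60, "beta-agarofuran": 0.30}),
    ("low", {"gamma-eudesmol": 0.66, "beta-agarofuran": 0.40}),
]
medians = median_by_grade(group_by_grade(samples))
```

The resulting dictionary holds one median per compound and grade, which is the quantity compared between the high quality and the other qualities in the final step of the flowchart.
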


4. RESULTS AND DISCUSSION

This section summarizes the results of the boxplot analysis that support the research objective.
Principal component analysis (PCA) was applied to the 660 data samples to determine the 11 important
chemical substances. These 11 chemical substances became the inputs to the boxplot. All samples were
arranged in columns using MATLAB software.

Figure 3 shows the distribution of the chemical substances for the high grade as a boxplot. The
compounds are numbered as follows: 1 = 10-epi-γ-eudesmol, 2 = α-agarofuran, 3 = γ-eudesmol,
4 = β-agarofuran, 5 = dihydrocollumellarin, 6 = ar-curcumene, 7 = valerianol, 8 = β-dihydroagarofuran,
9 = α-guaiene, 10 = allo aromadendrene epoxide, and 11 = γ-cadinene. The whiskers for γ-cadinene (11) are
very short, but the compound has some outliers located very far from its central tendency. The IQRs for
compounds 1 to 10 are larger, as shown by the size of the boxes, and the data are distributed uniformly.
Overall, Figure 3 shows that only γ-cadinene (11) has a very small, almost indistinguishable IQR, while
the other chemical substances have large IQRs.

Figure 3. The boxplot of Agarwood oil chemical substances for high grade


Descriptive statistics can give more informative summaries of hundreds or thousands of raw
numbers from previously collected data. The output data are summarized in terms of the mean, minimum,
maximum, and standard deviation, as shown in Table 2. As can be seen in Table 2, the abundance
percentages of the top seven chemical substances range from about 0.01 to 0.99.


Table 2. Abundance distribution of chemical substances for the high grade

Chemical compound       Min abundance (%)    Max abundance (%)
10-epi-γ-eudesmol       0.5269               0.9900
α-agarofuran            0.5198               0.9900
γ-eudesmol              0.0100               0.9900
β-agarofuran            0.2800               0.9250
dihydrocollumellarin    0.0100               0.9250
ar-curcumene            0.0100               0.9900
valerianol              0.0100               0.9668
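
The descriptive statistics named in the text (minimum, maximum, mean, and standard deviation) can be computed per compound as in the following Python sketch. The function name `describe` and the abundance values are hypothetical, and the sample standard deviation is an assumption because the text does not state which form was used.

```python
from statistics import mean, stdev


def describe(values):
    """Return the minimum, maximum, mean, and sample standard deviation."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "std": stdev(values),  # assumption: sample (n - 1) standard deviation
    }


# Hypothetical abundances (%) of one compound, for illustration only.
abundances = [0.52, 0.60, 0.75, 0.99]
summary = describe(abundances)
```

Applying `describe` to the abundance column of each compound would give one row of summary values per compound, in the same layout as Table 2.
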

The four qualities (high, medium high, medium low, and low) are compared in Figure 4. The highlight
of this line graph is that 10-epi-γ-eudesmol and γ-eudesmol were found to be important chemical
substances, with medians ranging between 0.4% and 0.8%. They are considered important because each of the
four grades has its own distinct median value for these compounds compared with the others. The highest
median (the 50% point of the data) belongs to β-agarofuran for the high grade, at 0.99%.

Figure 4. The median value of the Agarwood oil chemical substances for Aquilaria species


From Figure 5, further analysis shows that the highest quartile (Q3) of 10-epi-γ-eudesmol and
γ-eudesmol ranges between 0.4% and 0.81%. For 10-epi-γ-eudesmol, the highest Q3 belongs to the high
grade at 0.8102%, followed by the medium low, medium high, and low grades at 0.7578%, 0.5320%,
and 0.4961%, respectively. For γ-eudesmol, the highest Q3 belongs to the medium low grade, followed by
the high, low, and medium high grades, with values of 0.7764%, 0.7755%, 0.6549%, and 0.5833%,
respectively. Lastly, the overall highest Q3, which is the 75th percentile of the data, belongs to
β-agarofuran for the high grade at 0.99%. These two significant compounds (10-epi-γ-eudesmol and
γ-eudesmol) are confirmed to be the most important compared with the other nine compounds.

Figure 5. The highest quartile value of the Agarwood oil chemical substances
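
The ordering of the grades by highest quartile can be reproduced from the values reported in the text, as in the minimal Python sketch below. The dictionary layout and the name `rank_grades` are hypothetical, the compound names are written without Greek letters, and the 0.4961% value for 10-epi-γ-eudesmol is assumed to belong to the low grade, since that is the only grade not otherwise listed for that compound.

```python
# Highest-quartile values (%) as reported in the text for the two compounds.
q3_by_grade = {
    "10-epi-gamma-eudesmol": {
        "high": 0.8102,
        "medium low": 0.7578,
        "medium high": 0.5320,
        "low": 0.4961,  # assumption: grade inferred from context
    },
    "gamma-eudesmol": {
        "medium low": 0.7764,
        "high": 0.7755,
        "low": 0.6549,
        "medium high": 0.5833,
    },
}


def rank_grades(values_by_grade):
    """Return the grades ordered from the largest to the smallest value."""
    return sorted(values_by_grade, key=values_by_grade.get, reverse=True)


ranking = {name: rank_grades(values) for name, values in q3_by_grade.items()}
```

The two compounds give different grade orderings, with the high grade first for 10-epi-γ-eudesmol and the medium low grade first for γ-eudesmol.
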


5. CONCLUSION

The research work in this paper has achieved its objective by analysing boxplots of Agarwood oil
of the Aquilaria species across four grades (low, medium low, medium high, and high). The boxplot
method was chosen because it shows the shape of the distribution, the variability of the data, and its
significant values. With its median and upper and lower quartiles, the boxplot differentiates accurately
between the samples of the four grades of Agarwood essential oil. Moreover, the visualization is a
suitable and effective technique for describing the quality classification of the samples.
The input is the abundance of eleven significant chemical compounds, namely γ-eudesmol, ar-curcumene,
β-dihydroagarofuran, γ-cadinene, α-agarofuran, allo aromadendrene epoxide, valerianol, α-guaiene,
10-epi-γ-eudesmol, β-agarofuran, and dihydrocollumellarin, and the output is the grade of the Agarwood oil.
Overall, the boxplot and line graph identify 10-epi-γ-eudesmol and γ-eudesmol as important chemical
substances for future analysis, which is suggested to extend to five and six grades. The chemical
compound β-agarofuran is also recommended for consideration in future work since it gives high
quartile values for the medium high quality.